ABSTRACT

A comparison of four procedures for estimating common
factor measurements was made using artificially synthesized "data"
matrices. Score estimates were compared with respect to how well they
approximated associated true factor scores and the extent of
shrinkage in double cross-validation based on random samples. The
Horst (1965), Bartlett (1937), and Anderson and Rubin (1956) methods
gave what was judged as satisfactory estimates for the (artificial)
populations of data. The cross-validational procedures showed the
Horn (1965) method to yield highly unstable estimates. It was
concluded that the method of using columns of the factor loading
matrix as weights to be used in estimating factor measurements cannot
be recommended for general applications since this procedure
consistently provided highly unstable estimates. (Author)
The problem of factor score estimation may be introduced by discussing briefly
the classical factor analysis model, which may be expressed in matrix terms as:

$$Z' = F S_c' + U S_u' \qquad (1)$$

where $Z'$ = $(p \times N)$ transpose of the raw data matrix $Z$, which is scaled to have
        column means of zero and column standard deviations of unity, where
        p = the number of observed variates and N = the number of individuals
        or entities.

      $F$ = $(p \times m)$ factor-loading matrix for m derived common factors.

      $S_c'$ = $(m \times N)$ matrix of individual scores on the derived common factors,
        usually referred to as the common factor score matrix.

      $U$ = $(p \times p)$ diagonal matrix whose non-zero entries identify standard
        deviations of the derived uniqueness variables.

      $S_u'$ = $(p \times N)$ matrix of individual scores on the associated unique
        factors, usually referred to as the uniqueness factor score matrix.

The model may be classified as an additive, linear model (Thurstone, 1947);
equation (1) indicates that data can be represented as a sum of common portions
($F S_c'$) and unique portions ($U S_u'$). In general, the derived common factors may be
correlated or uncorrelated. The derived uniqueness variables are assumed to be
uncorrelated among themselves and with the common variables. These assumptions may be
represented algebraically in matrix terms as: $S_u' S_u = I$, and $S_c' S_u = 0$. For present
purposes it will also be reasonable to treat common factors as mutually uncorrelated;
that is, $S_c' S_c = I$.
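
The consequence of these assumptions for the observed correlations can be written out in one line. The following is a sketch only; it assumes that the cross-products in the assumptions above are already scaled by 1/N (the text leaves the scaling implicit), and it writes $R$ for the $(p \times p)$ matrix of correlations among the observed variates, a symbol chosen here for illustration.

$$
\begin{aligned}
R = Z'Z &= (F S_c' + U S_u')(S_c F' + S_u U) \\
        &= F\,(S_c' S_c)\,F' + F\,(S_c' S_u)\,U + U\,(S_u' S_c)\,F' + U\,(S_u' S_u)\,U \\
        &= F F' + U^2 .
\end{aligned}
$$

The two middle terms vanish because common and unique scores are uncorrelated, so the off-diagonal correlations depend on $F$ alone while $U^2$ contributes only to the diagonal.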

Common portions of data, i.e., $F S_c'$ in equation (1), are in general unobservable.
Hence, at best it is possible to estimate common factor scores. Geometrically the
problem of estimation becomes apparent. The usual geometric model for factor analysis
represents the p tests as a bundle of unit-length vectors embedded in an N-dimensional
Euclidean space. The derived common factors are represented as a set of m linearly
independent vectors embedded in the same N-space. But these common factor vectors
are outside the space determined by the origin and the end points of the test
vectors. Estimates of common factor scores which are based on observed variates are thus
often poor because they are based on information within the space of test variables
(Heerman, 1964; Thomson, 1951).
Theoretical speculation has led to the construction of several methods for
estimating factor and component scores. These procedures have been largely developed
within a least-squares regression framework. That is, several investigators have
approached the problem of estimating common factor scores by first estimating $F$ and
$U$ for some selection of m less than p, and then deriving the factor measurements
using some form of least-squares analysis treating the estimates of $F$ and $U$ as
though they were equal to corresponding population values (see Horst, 1965). Since
$S_c$ in equation (1) equals the $(N \times m)$ matrix of common factor score measurements, let
$\hat{S}$ represent the $(N \times m)$ matrix of corresponding estimated factor scores, and $B$
represent the $(p \times m)$ matrix of estimation or regression weights. The general problem of
estimation for these different least-squares methods may be expressed in matrix terms
as: $\hat{S} = ZB$, where $B$ depends on $F$ and $U$ alone, and is chosen in the case of each
factor score estimation procedure to minimize certain errors of estimation.

McDonald and Burr (1967) presented the formal properties of four least-squares 
estimation methods and compared these properties with respect to four generally de- 
sirable properties of estimated factor scores. In essence they developed the ration- 
ale for estimation and described the differences among estimates given by these pro- 
cedures with respect to theoretical criteria such as orthogonality, univocality, and 
conditional unbiasedness, which are discussed below. Harris (1967) discussed these 
same procedures and added a fifth (relatively crude) procedure that is often used or 
recommended, but is not generally considered as a standard method of estimation. 

The four desirable properties of estimated factor scores which were discussed
by McDonald and Burr are that: (a) estimated factor scores should approximate the
associated true factor scores as closely as possible, i.e., the diagonal elements of
the $(m \times m)$ matrix of cross-correlations between true and estimated scores should
approximate unity; (b) the set of m estimated score vectors should be mutually
orthogonal; (c) each vector of estimated factor scores should correlate zero with each
vector of non-corresponding true factor scores (this condition identifies univocal
factor scores; see Guilford and Michael, 1948); and (d) the estimated factor scores
should be conditionally unbiased estimators of corresponding true factor scores.
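
Properties (a) through (c) can all be read off two correlation matrices. The sketch below is illustrative only: the function names and the toy data are assumptions introduced here, not part of the study, and property (d), conditional unbiasedness, is not checked because it concerns expectations given the true scores rather than correlations.

```python
import numpy as np

def cross_correlations(A, B):
    """Correlations between the columns of A and the columns of B (same row count)."""
    A_std = (A - A.mean(axis=0)) / A.std(axis=0)
    B_std = (B - B.mean(axis=0)) / B.std(axis=0)
    return A_std.T @ B_std / A.shape[0]

def score_properties(S_true, S_hat):
    """Summaries for properties (a), (b) and (c) of estimated factor scores."""
    C = cross_correlations(S_true, S_hat)   # true vs. estimated scores
    H = cross_correlations(S_hat, S_hat)    # estimated vs. estimated scores
    off = ~np.eye(C.shape[0], dtype=bool)   # mask selecting off-diagonal cells
    return {
        "validity": np.diag(C).copy(),                                # (a) should be near 1
        "max_offdiag_estimates": np.abs(H[off]).max(initial=0.0),     # (b) should be near 0
        "max_offdiag_cross": np.abs(C[off]).max(initial=0.0),         # (c) should be near 0
    }

# Toy illustration: "true" scores plus independent noise as the estimates.
rng = np.random.default_rng(0)
S_true = rng.standard_normal((500, 3))
S_hat = S_true + 0.5 * rng.standard_normal((500, 3))
props = score_properties(S_true, S_hat)
```

Estimates that were exactly orthogonal and univocal would give zeros for the two off-diagonal summaries; the validity entries correspond to the zero-order true-estimated correlations used later in the study.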

Apparently, the choice of initial factoring method makes a difference in the 
estimation of factor score measurements. Harris (1967) and McDonald and Burr (1967) 
indicated that the choice of canonical factor analysis (Rao, 1955) as the initial 
factoring method produces estimated factor scores with particularly desirable
properties. Canonical factor analysis is a special case of maximum likelihood factor
analysis (Joreskog, 1967; Lawley, 1940; Rao, 1955). Browne (1968) discussed the
properties of several factor analytic techniques and made an empirical comparison of
results given by these techniques. He indicated that in general, estimates of
factor-loadings given by the maximum likelihood method are theoretically preferable
because they are asymptotically efficient and there is a corresponding likelihood
ratio test for assessing the fit of the factor model. His results, together with
those of Harris and McDonald and Burr, suggest that maximum likelihood factor
analysis provides a desirable basis for estimation of factor measurements.

Trites and Sells (1955) compared the unit weighted method and the fractionally
weighted method for estimating factor scores, using correlation coefficients. The
two methods gave practically identical results. From the standpoint of computation,
the unit weighted method was the simpler and it was concluded that this was the more
desirable of the two methods for practical applications. Baggaley and Cattell (1956)
described certain exact and linear function estimates of oblique factor scores and
discussed the conditions under which they were appropriate. They showed the extent
of approximation in using factor-loadings in place of the exact regression weights
in a 15 factor, 70 variable, 295 person problem. The correlations between the true
and estimated scores ranged from .67 to .94 for the approximate procedure. Moseley
and Klett (1964) empirically compared the results given by three methods of factor
score estimation. Their results indicated that the methods were roughly
equivalent insofar as the intercorrelations among score estimates and reliabilities
were concerned. Horn (1965) described and empirically compared exact and approximate
procedures for estimating factor scores, using coefficients of congruence (Tucker,
1951). His exact procedures, i.e., those which use some form of least-squares
analysis, were correlated above .90 with one another. His approximate procedures, i.e.,
those which do not use a least-squares analysis, also correlated above .90 with one
another. Wackwitz and Horn (1971) compared estimates given by exact and approximate
procedures to true (population) factor scores, using a variety of criteria for
comparison. Their results indicated that the weighted salients method (Horn, 1965)
produced score estimates more closely matching associated true factor scores than any
of the other methods of the study. As can be seen, there appears to have been little
empirical research involving the estimation of individual factor measurements.

There appears to be practical value in comparing the results given by different 
estimation procedures with respect to indices such as predictive validities and the 
amount of shrinkage to be expected when these validities are examined with cross- 
validation procedures. The aim of this study was to compare four procedures for
estimating common factor measurements using artificially synthesized "data" matrices.
In essence, two major questions were raised: (a) how well will the factor score es- 
timates derived using the four procedures approximate associated true factor scores 
which are available from the data simulation procedures? and (b) what will be the 
extent of shrinkage in the canonical correlations when the validities of estimates 
derived using the four estimation procedures are examined by cross-validation pro- 
cedures? 

Methodology 

Data Simulation 

Data for this study were computer simulated using the classical factor analysis
model as represented by equation (1). Eight simple structure factor-loading matrices
F were used to develop the common and unique portions of the population data for Ns
of 200 and 300. Each of these Fs was chosen from the factor analysis literature with
respect to a variety of criteria such as simplicity of loadings, sizes and
variabilities of communalities, and ratio of factors to variables (see Table 1). Common and
unique factor scores were generated for the population to fit the standard
assumptions of mutual and joint orthogonality.
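
One way such a population could be synthesized is sketched below. This is not the program used in the study; the function names, the small loading matrix, and the use of a QR decomposition to force exact orthogonality are all assumptions made for illustration. Scores are scaled so that cross-products divided by N give the identity.

```python
import numpy as np

def orthonormal_scores(N, k, rng):
    """N x k score matrix with column means 0 and S'S / N = I exactly (needs N > k)."""
    X = rng.standard_normal((N, k))
    X -= X.mean(axis=0)                 # centering keeps every column orthogonal to 1
    Q, _ = np.linalg.qr(X)              # orthonormal basis of the centered columns
    return Q * np.sqrt(N)

def simulate_population(F, N, seed=0):
    """Build Z = S_c F' + S_u U for a loading matrix F with communalities below 1."""
    rng = np.random.default_rng(seed)
    p, m = F.shape
    u2 = 1.0 - np.sum(F**2, axis=1)     # uniqueness variances (diagonal of U squared)
    S = orthonormal_scores(N, m + p, rng)
    S_c, S_u = S[:, :m], S[:, m:]       # common and unique scores, jointly orthogonal
    Z = S_c @ F.T + S_u * np.sqrt(u2)   # column scaling is S_u @ U for diagonal U
    return Z, S_c, S_u

# Hypothetical 6-variable, 2-factor loading matrix (not one of the study's Fs).
F = np.array([[.8, .0], [.7, .0], [.6, .0],
              [.0, .8], [.0, .7], [.0, .6]])
Z, S_c, S_u = simulate_population(F, N=200, seed=1)
```

Because the scores are exactly orthonormal, the population correlation matrix `Z.T @ Z / N` reproduces `F @ F.T` plus the diagonal of uniquenesses without sampling error, which is the sense in which the population "fits" the model.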

INSERT TABLE 1 ABOUT HERE 



Generating the Cross-Validation Samples

For each combination of F and N, the cross-validation sample pairs served as
the data base for the estimation of factor scores. These were determined by randomly
splitting the $(N \times p)$ population data matrix $Z$ into two $(n \times p)$ halves, where n = N/2.
For an appropriate permutation of columns, $Z' = [\,Z_a' \;\; Z_b'\,]$. The synthetic data $Z_a$ and
$Z_b$ were represented as $Z_k' = F S_{ck}' + U S_{uk}'$ (k = a, b). While $S_{ck}$ and $S_{uk}$ do not in general
possess the (exact) properties of true (common and unique) parts described above,
they were, nevertheless, taken as one representation of common and unique factor
scores for the respective half samples. The use of half-sample data of this form is
not inconsistent with the approach taken in many applications of factor analysis
where it is assumed that a population data matrix fits the common factor model but
that samples of observation vectors taken from this population may fit the model
only approximately.
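
The random halving can be sketched in a few lines. The function name and the demonstration arrays are hypothetical; the only point illustrated is that rows of the data matrix and rows of the true score matrix must be split with the same permutation so that each half keeps its own "true" scores.

```python
import numpy as np

def split_halves(Z, S_c, seed=0):
    """Randomly split the rows of Z and S_c into two halves of size n = N // 2."""
    rng = np.random.default_rng(seed)
    N = Z.shape[0]
    n = N // 2
    perm = rng.permutation(N)           # one permutation shared by data and scores
    a, b = perm[:n], perm[n:2 * n]
    return (Z[a], S_c[a]), (Z[b], S_c[b])

# Demonstration on placeholder arrays with recognizable row labels.
Z_demo = np.arange(200 * 5, dtype=float).reshape(200, 5)
S_demo = np.arange(200 * 2, dtype=float).reshape(200, 2)
(Z_a, S_a), (Z_b, S_b) = split_halves(Z_demo, S_demo, seed=3)
```

After the split, the half-sample scores are in general no longer exactly orthonormal, which is the point made in the paragraph above.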
Factoring Method

In the case of each population of data, $Z_a$ and $Z_b$ were used to generate
correlation matrices which served as starting points for maximum likelihood factor
analysis. Maximum likelihood factor analysis (Lawley, 1940) was chosen as the factoring
method of this study because the theoretical properties of factor scores derived
using this method are relatively well understood (Harris, 1967; McDonald and Burr,
1967). Computer program UMLFA (unrestricted maximum likelihood factor analysis,
Joreskog, 1966) was used to obtain the maximum likelihood estimates of factor-loadings
and uniqueness variances used in the factor score estimation. UMLFA provided initial
(untransformed) estimates of factor-loadings, $\hat{F}_{ok}$ (k = a, b), and orthogonally (using
varimax, Kaiser, 1958) transformed versions, $\hat{F}_k$, for each selection of a number of
factors, m. This study used derived orthogonal solutions (Fs of the form $F_o T$, for
$F_o$ an untransformed solution) because this made it possible to make direct
comparisons of $\hat{F}_a$ and $\hat{F}_b$ to one another and to the corresponding $F$ which was used in data
simulation. In this sense it seems reasonable to use the sets of zero-order
correlations computed between resulting matched vectors of true and estimated factor scores
as one criterion for examining predictive validities.
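
For readers unfamiliar with the transformation step, the sketch below shows a widely used SVD-based iteration for a varimax rotation of the form $F_o T$. It is not the UMLFA routine; the function name, the default settings, and the example loadings are assumptions made here. The `normalize` option applies Kaiser's row normalization before rotating. The iteration is a heuristic ascent and is not guaranteed to converge for every input (highly symmetric loading patterns can make it stall).

```python
import numpy as np

def varimax(F0, normalize=True, max_iter=500, tol=1e-10):
    """Rotate F0 (p x m) orthogonally toward the varimax criterion.

    Returns (F0 @ T, T) with T an orthogonal (m x m) matrix.
    Rows of F0 must have non-zero communality when normalize=True.
    """
    F0 = np.asarray(F0, dtype=float)
    p, m = F0.shape
    h = np.sqrt(np.sum(F0**2, axis=1)) if normalize else np.ones(p)
    A = F0 / h[:, None]                       # Kaiser-normalized rows
    T = np.eye(m)
    crit = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient-like matrix of the varimax criterion at the current rotation.
        G = A.T @ (L**3 - L * (np.sum(L**2, axis=0) / p))
        U_, s, Vt = np.linalg.svd(G)
        T = U_ @ Vt                           # nearest orthogonal matrix to G
        new_crit = s.sum()
        if new_crit - crit < tol * max(new_crit, 1.0):
            break                             # no further improvement
        crit = new_crit
    return (A @ T) * h[:, None], T

# Hypothetical unrotated loadings for six variables on two factors.
F0 = np.array([[.70, .30], [.65, .25], [.60, .35],
               [.55, -.45], [.50, -.50], [.45, -.40]])
F_rot, T = varimax(F0)
```

Because T is orthogonal, the rotated loadings reproduce the same common-factor correlations and the same communalities as the unrotated ones; only the orientation of the factors changes, which is what permits the column-by-column comparisons described above.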
Factor-Score Estimation

The reader will recall that for each combination of F and N, the
cross-validation sample pairs, $Z_a$ and $Z_b$, served as the data base for factor score
estimation. The methods chosen for estimating factor measurements may be identified with
respect to the following formulas:

$$\hat{S}_{1k} = Z_k \hat{F}_k (\hat{F}_k' \hat{F}_k)^{-1} \qquad \text{Horst (1965)} \qquad (2)$$

$$\hat{S}_{2k} = Z_k \hat{U}_k^{-2} \hat{F}_k (\hat{F}_k' \hat{U}_k^{-2} \hat{F}_k)^{-1} \qquad \text{Bartlett (1937)} \qquad (3)$$

$$\hat{S}_{3k} = Z_k \hat{U}_k^{-2} \hat{F}_k (\hat{F}_k' \hat{U}_k^{-2} R_k \hat{U}_k^{-2} \hat{F}_k)^{-1/2} \qquad \text{Anderson and Rubin (1956)} \qquad (4)$$

$$\hat{S}_{4k} = Z_k \hat{F}_k \qquad \text{Horn (1965)} \qquad (5)$$

$\hat{F}_k$ and $\hat{U}_k$ (k = a, b) represent the maximum likelihood estimates of the
corresponding population $F$ and $U$ in the case of each sample of data. $R_k$ is the
variance-covariance (or correlation) matrix associated with $Z_k$ and $\hat{F}_k$.

The $\hat{S}_{jk}$ (j = 1, 2, 3, 4; k = a, b) are matrices of order $(n \times m)$ and may be
expressed alternately as $\hat{S}_{jk} = Z_k E_{jk}$, where the $E_{jk}$ are $(p \times m)$ matrices of estimation
weights corresponding to that portion to the right of the $Z_k$ matrix in (2) - (5).

The Horst, Bartlett, and Anderson and Rubin methods were selected with
reference to the theoretical factor score properties discussed by McDonald and Burr.
In general, regardless of initial factoring method used, the Horst and Bartlett
estimates are univocal and conditionally unbiased estimators of corresponding true
scores. Anderson and Rubin estimates are in general orthogonal. What has been termed
the Horn method was included in this study because it represents a quick and
convenient means of estimation which is sometimes used or recommended for general
applications (Horn, 1965; Wackwitz and Horn, 1971).
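
The four weight matrices can be computed directly once loadings, uniquenesses, and a correlation matrix are in hand. The sketch below follows one reading of the four formulas (loadings with an inverse cross-product for Horst, uniqueness-weighted loadings for Bartlett and for Anderson and Rubin, and the bare loading matrix for Horn); the function names and the toy inputs are assumptions introduced for illustration, and the toy loadings are not maximum likelihood estimates.

```python
import numpy as np

def inv_sqrt_sym(M):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.T

def estimation_weights(F, u2, R):
    """Return the (p x m) weight matrices E1..E4 so that scores are Z @ E.

    F  : (p x m) loading matrix
    u2 : length-p vector of uniqueness variances (diagonal of U squared)
    R  : (p x p) correlation matrix of the observed variates
    """
    W = F / u2[:, None]                      # U^{-2} F
    E1 = F @ np.linalg.inv(F.T @ F)          # Horst: least squares on the loadings
    E2 = W @ np.linalg.inv(F.T @ W)          # Bartlett: weighted least squares
    E3 = W @ inv_sqrt_sym(W.T @ R @ W)       # Anderson and Rubin: orthogonalized
    E4 = F.copy()                            # Horn: loadings used as weights
    return E1, E2, E3, E4

# Toy inputs: standardized random data and a hypothetical loading matrix.
rng = np.random.default_rng(0)
n, p, m = 150, 6, 2
Z = rng.standard_normal((n, p))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
R = Z.T @ Z / n
F = np.array([[.8, .1], [.7, .0], [.6, .2],
              [.1, .8], [.0, .7], [.2, .6]])
u2 = 1.0 - np.sum(F**2, axis=1)
E1, E2, E3, E4 = estimation_weights(F, u2, R)
S3 = Z @ E3                                  # Anderson-Rubin score estimates
```

Two algebraic checks fall out of these definitions: `E1.T @ F` and `E2.T @ F` both equal the identity, which is the weight-level counterpart of conditional unbiasedness, and `S3.T @ S3 / n` equals the identity when R is the sample correlation matrix of Z, which is the orthogonality property attributed to the Anderson and Rubin estimates.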






Ambrosino, p. 7 

Examination of Validities

A. Two criteria were used to examine the quality of estimates for each method
of estimation, in the case of each sample of data: (1) the set of m zero-order
correlations ($r_{te}$) computed between the corresponding vectors of estimated scores
identified as $\hat{S}_{jk}$ (j = 1, 2, 3, 4; k = a, b) and true scores identified as $S_{ck}$; and (2)
the set of m canonical correlations ($R_{ci}$, i = 1, 2, ..., m) computed between the
(optimally) linearly weighted composite of estimated scores, $\hat{S}_{jk}$, and the (optimally)
linearly weighted composite of true scores, $S_{ck}$. For the latter, the (undeviated)
root mean square, written

$$RMS = \left( \frac{1}{m} \sum_{i=1}^{m} R_{ci}^2 \right)^{1/2},$$

was computed as a summary index of predictive validity.
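
The canonical correlations and their root mean square can be obtained from orthonormal bases of the two score sets. The sketch below is one standard numerical route (QR decompositions followed by a singular value decomposition) and is not claimed to be the computation used in the study; the function names are hypothetical, and both matrices are assumed to have full column rank.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the column sets of X and Y (same row count)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)                 # orthonormal basis for the X scores
    Qy, _ = np.linalg.qr(Yc)                 # orthonormal basis for the Y scores
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)              # guard against rounding just above 1

def canonical_rms(X, Y):
    """Undeviated root mean square of the canonical correlations."""
    r = canonical_correlations(X, Y)
    return float(np.sqrt(np.mean(r**2)))

# Toy check: a nonsingular linear transformation leaves the space unchanged.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
A = np.array([[2., 1., 0.], [0., 1., 1.], [1., 0., 3.]])   # determinant 7
rms_same_space = canonical_rms(X, X @ A)
rms_noisy = canonical_rms(X, X + rng.standard_normal((100, 3)))
```

Unlike the zero-order correlations, this index depends only on the spaces spanned by the two score sets, so any nonsingular re-weighting of the estimates leaves it unchanged.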

B. A (double) cross-validation paradigm was employed to examine the stability
of estimates given by the four procedures. The following notation is designed to
facilitate an understanding of these cross-validation procedures. In the case of
each population of data, consider $Z_k$ as the hypothesis-generating sample and $Z_{k'}$ as
the corresponding validation sample (k' = b if k = a, k' = a if k = b). Factor
score estimates, $\hat{S}_{jk}$ (j = 1, 2, 3, 4) and estimation weights, $E_{jk}$, were calculated for
the initial or hypothesis-generating sample. RMSs were calculated for canonical
correlations computed between initial factor score estimates, $\hat{S}_{jk}$, and true scores,
$S_{ck}$.

The $E_{jk}$ derived using the initial or hypothesis-generating sample were then
applied to the data of the corresponding validation sample, creating four new sets
of (cross-validity) factor score estimates identified as $\hat{S}^*_{jk} = Z_k E_{jk'}$. RMSs were
calculated for canonical correlations computed between cross-validity estimates,
$\hat{S}^*_{jk}$, and true scores, $S_{ck}$. The reader may wish to refer to Figure 1, which is a
schematic diagram of the cross-validation procedures described above.
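
The double cross-validation loop can be sketched as follows for a single estimation method. Everything here is illustrative: the function names, the dictionary layout, and the toy data are assumptions, and shrinkage is taken as the RMS obtained with a half sample's own weights minus the RMS obtained when the opposite half's weights are applied to the same data.

```python
import numpy as np

def canonical_rms(X, Y):
    """Root mean square of the canonical correlations between X and Y."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    s = np.clip(np.linalg.svd(Qx.T @ Qy, compute_uv=False), 0.0, 1.0)
    return float(np.sqrt(np.mean(s**2)))

def double_cross_validation(Z_a, Z_b, S_a, S_b, E_a, E_b):
    """For each half k, return (RMS with own weights, RMS with other weights, shrinkage)."""
    halves = {"a": (Z_a, S_a, E_a, E_b), "b": (Z_b, S_b, E_b, E_a)}
    out = {}
    for k, (Z, S_true, E_own, E_other) in halves.items():
        rms_own = canonical_rms(Z @ E_own, S_true)      # initial estimates
        rms_cross = canonical_rms(Z @ E_other, S_true)  # cross-validity estimates
        out[k] = (rms_own, rms_cross, rms_own - rms_cross)
    return out

# Toy data: two halves generated from the same hypothetical weights plus noise.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 2))
Z_a, Z_b = rng.standard_normal((100, 6)), rng.standard_normal((100, 6))
S_a = Z_a @ B + rng.standard_normal((100, 2))
S_b = Z_b @ B + rng.standard_normal((100, 2))
E_a = B + 0.1 * rng.standard_normal((6, 2))   # stand-ins for half-sample weights
E_b = B + 0.1 * rng.standard_normal((6, 2))
result = double_cross_validation(Z_a, Z_b, S_a, S_b, E_a, E_b)
average_shrinkage = np.mean([result["a"][2], result["b"][2]])
```

Averaging the third entry over the two halves gives a single residual per method of the kind summarized in the tables; the sign of each entry can be positive or negative.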

INSERT FIGURE 1 ABOUT HERE 



The shrinkage between RMSs computed for each method identified as $\hat{S}_{jk}$ and
$\hat{S}^*_{jk}$ corresponds to the stability of the estimates based on the initial
hypothesis-generating sample $Z_k$. (Note: when $\hat{F}_k$ and $\hat{U}_k$ are closer approximations to the
corresponding population $F$ and $U$ than are $\hat{F}_{k'}$ and $\hat{U}_{k'}$, RMSs computed for estimates based
on weights given by $E_{jk}$ will be higher than those computed for estimates based on
weights given by $E_{jk'}$. That is, the (direction of) shrinkage will be positive. If,
however, $\hat{F}_{k'}$ and $\hat{U}_{k'}$ provide the closer approximations to $F$ and $U$, then it is
possible to have negative shrinkage. That is, RMSs computed for estimates based on the
validation sample weights, $E_{jk'}$, will be higher than those computed for estimates
based on weights given by the original hypothesis-generating sample. Thus, in the
sense that negative shrinkage can be observed here, the present study is an
unconventional cross-validation study.)

Results 

Several summary statistics based on results from each of the comparisons noted
above are presented below for each method of estimation and for each selection of
F and N, and for all values of m/p, the ratio of the number of factors to the
number of variables for the respective solution. These statistics allow the reader
readily to examine for himself the quality of estimates given by the different
methods with respect to the factor score criteria of this study.

For each estimation method and for each combination of F and N, the following
summary statistics are given: (1) the average zero-order correlation ($r_{te}$) between
the true and estimated factor scores; (2) the average canonical root mean square
(RMS), which is included as a summary index of predictive validity; and (3) the
average residual between the canonical root mean square ($RMS_r$) computed for the
hypothesis-generating and the validation estimates. All averages are simply
unweighted means computed across results for both halves of data.

Tables 2-5 include these summary statistics. Each table contains statistics
for a single estimation method. The reader should recognize that a single (small
value) selection of m was always used when N = 200. The final (row) entries in each
table represent (unweighted) averages for each statistic, computed across all the
initial Fs included. Although different combinations of row complexities, (sizes
and variabilities of) communalities and ratios of factors to variables were
represented, it seems reasonable to summarize results using these composite statistics.
For all these tables, there are no entries for the Browne (1968) data with N = 200
and m = 3, for the Overall and Porterfield (1963) data with N = 200 and m = 3, and
for the entire Conry (1965) data. For no clearly apparent reason, the $r_{te}$s,
canonical Rs and root mean squares computed for these data were extremely small. Thus,
these data were censored for this summary in view of the writer's belief that these
data sets are not generally comparable with the rest.

The data of Tables 2, 3, and 4 indicate that the Horst, Bartlett and Anderson
and Rubin methods gave what may be judged as satisfactory results for each
combination of F and N with respect to the estimates approximating associated true scores
and with respect to cross-validational shrinkage. Inspection of Tables 2-5
indicates that there were only small differences among these summary statistics given
for N = 200 and N = 300, when using the same ratio of factors to variables.

Perhaps the most striking observation from these tables is the degree of
similarity of results for the first three methods of factor score estimation and the
finding of essentially no shrinkage for these methods. Had the ratio of sample to
population size, n/N, been smaller, the probability of negative shrinkage would no
doubt have been lowered; that negative shrinkage was occasionally found for
sampling ratios of 1/2 clearly does not imply that opposite sample Fs will produce
higher cross-sample validities in general. Nevertheless, it at least seems reasonable
to suggest that score estimates based on the first three methods are apt to be
relatively stable for many applications.

Another finding which is manifest from a study of the summary tables is that
average $r_{te}$s or RMSs for the individual Fs are quite highly correlated with the
average communalities for these Fs. This is of course not surprising. The one point
that is interesting, however, is that $r_{te}$s are greater than .70 despite the fact
that the smallest average communalities are about .50. Highest levels of $r_{te}$, about
.90, were reached for the F matrix from the work of Wiggins and Lovell (1965), the
average communality for which was .69.

One fact that was initially unsettling is that, depending on the population F,
the RMSs sometimes increased and sometimes decreased as the ratio m/p was
increased. For the Browne, Overall and Porterfield, and Wiggins and Lovell Fs, it is seen
that RMSs increase markedly as m/p is increased. But for the Bechtoldt (1961) and
the Emmett (1949) data, the opposite effect occurs; as the half-sample m/p ratio is
increased to the population ratio, the RMSs decrease. The reason, apparently, is
that for the latter two data sets (Bechtoldt and Emmett), population Fs have
relatively lower complexity rows for an orthogonal solution than do the three former
sets. It is suggested that when the sample ratios m/p were set to be smaller than
the population ratio for the complex Fs of the Browne, Overall and Porterfield, and
Wiggins and Lovell data, the maximum likelihood factoring procedure resulted in
derived half-sample Fs whose columns may not have clearly matched any of those of the
associated population F; that the half-sample Fs were instead "stretched" across
the true common space. As the sample ratio m/p was increased to the population
ratio, for the more complex data, the individual factors tended better to match the
population factors; thus the RMSs reached their highest values for largest m/p
ratios for these data. Perhaps the conclusion that one ought to make in this context
is that RMSs associated with the largest values of m/p are the ones which ought to
be given primary attention for interpretation of all methods. When one compares the
four methods using these rows alone, however, the same general conclusions are
reached about the relative merits of these four methods of estimation. This can be
seen by inspecting the final row entries of each of these tables.

As must be true based on analytical study, the average root mean square
statistic (RMS) was identical for both the Horst and the Horn methods and for both
the Bartlett and Anderson and Rubin methods, across the different specifications
of F included in these tables. However, as can be seen from Tables 2 and 5, the
true-estimated correlations were distinctly lower for the Horn method than for the
Horst method within each half sample for each specification of F and N.
Despite the fact that the Horn method appeared to give what may be judged as
satisfactory true-estimated correlations and canonical root mean squares, the
cross-validational procedures showed this method to yield highly unstable estimates. As
can be seen from Table 5, the cross-validational shrinkage was substantially higher
for this method across all specifications of F and N than the shrinkage for the
other three methods. Recalling information from Table 5, the extent of shrinkage was
generally low for the first (larger) factors, but extremely high for the subsequent
factors, when using what has been termed the Horn method.
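
The within-sample equality of the RMS index for the Horst and Horn methods, and for the Bartlett and Anderson and Rubin methods, can be seen from the form of the estimators. The lines below are a sketch that assumes the four estimators have the least-squares, weighted least-squares, orthogonalized, and loadings-as-weights forms usually attributed to these authors; the symbols $A_k$, $M_k$, and $G_k$ are shorthand introduced here.

$$
\begin{aligned}
\hat{S}_{1k} &= Z_k \hat{F}_k (\hat{F}_k' \hat{F}_k)^{-1} = \hat{S}_{4k}\, A_k, 
  &\quad A_k &= (\hat{F}_k' \hat{F}_k)^{-1}, \\
\hat{S}_{3k} &= Z_k \hat{U}_k^{-2} \hat{F}_k\, G_k^{-1/2} = \hat{S}_{2k}\, M_k\, G_k^{-1/2},
  &\quad M_k &= \hat{F}_k' \hat{U}_k^{-2} \hat{F}_k, \quad
  G_k = \hat{F}_k' \hat{U}_k^{-2} R_k \hat{U}_k^{-2} \hat{F}_k .
\end{aligned}
$$

Since $A_k$ and $M_k G_k^{-1/2}$ are nonsingular $(m \times m)$ matrices, the members of each pair span the same m-dimensional space, and canonical correlations with the true scores depend only on that space. The zero-order correlations $r_{te}$ pair individual columns and are not invariant under such transformations, which is why they can differ between the Horst and Horn methods while the RMS values agree.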

No noticeable patterns in the results given by the four estimation methods were
observed between Ns of 200 and 300, across all specifications of initial population
Fs. Apparently, the difference in size of N was not large enough to produce a
noticeable effect for these data.

INSERT TABLES 2-5 ABOUT HERE 

Conclusions 

Based on the results presented above, the following general conclusions may be
drawn. The first three methods of this study (Horst, Bartlett and Anderson and Rubin)
gave practically identical results with respect to approximating true factor scores
and with respect to cross-validational shrinkage. Each of these methods gave what
may be judged as satisfactory results for these (artificial) populations of data. It
has been shown that for general applications, it is not unreasonable to expect
vectors of estimated factor scores based on any of the first three methods of this study
to correlate upwards of .70 with underlying true factor scores when average
communalities for the initial population Fs are above .50.

Despite the fact that the crude method attributed to Horn appeared to give
satisfactory within-sample true-estimated correlations and canonical root mean
squares, the cross-validational procedures of this study showed this method to yield
highly unstable estimates. The conclusion here is that the method of selecting
columns of F to be used as weights for estimating factor measurements cannot be
recommended for general applications since this procedure consistently provided highly
unstable estimates.

Extensions which might be considered in future studies of the present type
include the following: A single cross-validation sample pair ($Z_a$ and $Z_b$) was analyzed
for each combination of F and N. In this study, the objective was to examine a wide
variety of Fs and selections of m and N, on the assumption that point estimates of
validity coefficients would suffice for the inter-method comparisons of the
different methods for estimating factor scores. Studies which included several
cross-validation sample pairs in the case of each population of data would make it
possible to generate approximations to the distributions of validities and shrinkage for
each of several estimation methods across all selections of F and N.

To provide a greater opportunity to observe the effect of sample size on the
quality of estimates given by the four respective methods, future studies might
include a wider variety of Ns, for example, N = 200, 500, 700, and 1000. A larger set
of Ns would also make it possible to generate cross-validation samples whose
sample sizes were some fraction of the initial population size other than one half. For
each of the Ns in this expanded set, the analyses of future studies probably should
include only those selections of m that are equal to the number of factors for the
associated population F.

It might be interesting to include other common factor solutions to derive $\hat{F}$
and $\hat{U}$ for the different specifications of F and N. Normal varimax was used to
obtain the orthogonally transformed versions of $\hat{F}_a$ and $\hat{F}_b$ in the case of each sample
of data. This transformation algorithm had also been used to derive several of the
initial population Fs. Oblique transformations (see Harris and Kaiser, 1964;
Hofmann, 1970) perhaps ought to be investigated in future studies of this kind.

Of course, further variations on the present theme could take on many forms.
Multivariate analysis typically involves the estimation of so many parameters that
one cannot in a single study vary all relevant dimensions of parameter
investigation. While analytical studies are clearly essential for methodological progress,
studies of the present variety appear to have considerable value for refining
knowledge and for making judgments about practical uses of quantitative methods.
TABLE 1

Sources for Matrices Which Were
Used in Construction of Population Data

Source for F                     Mean h^2     S.D. h^2     m/p

Bechtoldt (1961)                   .661         .158       6/17
Browne (1968)                      .472         .290       4/12
Conry (1965)                       .657         .130       6/17
Emmett (1949)                      .639         .140       4/9
Harman (1967)                      .500         .140       4/20
Maxwell (1961)                     .535         .230       4/10
Overall-Porterfield (1963)         .673         .080       5/15
Wiggins-Lovell (1965)              .694         .110       3/13

TABLE 2

Summary Statistics for the Horst (1965)
Classical Least-Squares Method of Estimation

                              N = 200                      N = 300
F               m/p     r_te     RMS     RMS_r       r_te     RMS     RMS_r

Bechtoldt       4/17    .893    .900     .004        .880    .897    -.002
                6/17                                 .883    .895    -.015

Browne          3/12                                 .822    .814     .003
                4/12                                 .798    .851     .000

Emmett          2/9     .847    .856   [illeg.]      .848    .854    -.002
                3/9                                  .801  [illeg.]  -.008
                4/9                                  .622    .828   [illeg.]

Harman          2/20    .775    .782    -.001        .787    .794     .002
                3/20                                 .706    .765     .001
                4/20                                 .818    .843    -.001

Maxwell         3/10    .736    .780    -.006        .762    .798     .007
                4/10                                 .740    .808    -.010

Overall-        3/15                                 .828    .856     .002
Porterfield     5/15                                 .905    .923     .001

Wiggins-        2/13    .874    .884    -.001        .888    .892    -.002
Lovell          3/13                                 .932    .936     .000

Average                 .825    .840    -.002        .814    .852    -.002
                                                    (.814)  (.870)  (-.005)

Note - Entries in parentheses are those averages computed for selections of the
       sample ratio, m/p, that were equal to the ratio, m/p, for the respective
       population F.
TABLE 3

Summary Statistics for the Bartlett
(1937) Method of Estimation

                              N = 200                      N = 300
F               m/p     r_te     RMS     RMS_r       r_te     RMS     RMS_r

Bechtoldt       4/17    .893    .901     .005        .891    .898     .008
                6/17                                 .855    .870    -.006

Browne          3/12                                 .772    .816     .007
                4/12                                 .812    .825     .007

Emmett          2/9     .853    .860    -.002        .854    .860    -.002
                3/9                                  .792    .824     .001
                4/9                                  .586    .723     .008

Harman          2/20    .791    .800    -.003        .762    .786     .005
                3/20                                 .686    .767    -.004
                4/20                                 .817    .841    -.004

Maxwell         3/10    .724    .768    -.012        .754    .779    -.001
                4/10                                 .726    .806    -.016

Overall-        3/15                                 .838    .861    -.001
Porterfield     5/15                                 .905    .924     .000

Wiggins-        2/13    .879    .888    -.003        .898    .904    -.004
Lovell          3/13                                 .935    .937    -.002

Average                 .823    .843    -.002        .805    .839     .000
                                                    (.806)  (.846)  (-.002)
TABLE 4

Summary Statistics for the Anderson and Rubin
(1956) Method of Estimation

                              N = 200                      N = 300
F               m/p     r_te     RMS     RMS_r       r_te     RMS     RMS_r

Bechtoldt       4/17    .898    .901     .005      [illeg.]  .898     .008
                6/17                                 .864    .870    -.006

Browne          3/12                               [illeg.]  .816     .007
                4/12                                 .816    .825     .007

Emmett          2/9     .814    .860    -.002        .856    .860    -.002
                3/9                                  .805    .824     .000
                4/9                                  .556    .723     .008

Harman          2/20    .790    .800    -.003        .741    .786     .005
                3/20                                 .695    .767    -.004
                4/20                                 .826    .841    -.004

Maxwell         3/10    .734    .768    -.012        .767    .790     .010
                4/10                                 .751    .806    -.016

Overall-        3/15                                 .838    .861    -.001
Porterfield     5/15                                 .907    .924     .000

Wiggins-        2/13    .880    .888    -.003        .898    .904    -.005
Lovell          3/13                                 .935    .937    -.002

Average                 .823    .842    -.003        .810    .840     .000
                                                    (.808)  (.846)  (-.002)
TABLE 5

Summary Statistics for the Horn (1965)
Method of Estimation

                              N = 200                      N = 300
F               m/p     r_te     RMS     RMS_r       r_te     RMS     RMS_r

Bechtoldt       4/17    .744    .900     .372        .748    .897     .208
                6/17                               [illeg.]  .895     .228

Browne          3/12                               [illeg.]  .814     .272
                4/12                                 .785    .852     .309

Emmett          2/9     .743    .856     .279        .743  [illeg.]   .294
                3/9                                  .706    .835     .285

Harman          2/20    .607    .782     .091        .648    .794     .196
                3/20                                 .619    .765      *
                4/20                                 .668    .843     .328

Maxwell         3/10    .712    .780     .209        .710    .798     .197
                4/10                                 .666    .808     .284

Overall-        3/15                               [illeg.]  .856     .099
Porterfield     5/15                                 .876    .923     .181

Wiggins-        2/13    .861    .884      *          .872    .892      *
Lovell          3/13                                 .894    .936     .206

Average                 .733    .840     .237        .764    .852     .223
                                                    (.754)  (.870)   (.231)

Note - Asterisk identifies those cases where the RMS was not calculated for the
       validation estimates and therefore no residual was found.